Learnable Visual Markers

Neural Information Processing Systems

We propose a new approach to designing visual markers (analogous to QR-codes, markers for augmented reality, and robotic fiducial tags) based on the advances in deep generative networks. In our approach, the markers are obtained as color images synthesized by a deep network from input bit strings, whereas another deep network is trained to recover the bit strings back from the photos of these markers. The two networks are trained simultaneously in a joint backpropagation process that takes characteristic photometric and geometric distortions associated with marker fabrication and capture into account. Additionally, a stylization loss based on statistics of activations in a pretrained classification network can be inserted into the learning in order to shift the marker appearance towards some texture prototype. In the experiments, we demonstrate that the markers obtained using our approach are capable of retaining bit strings that are long enough to be practical. The ability to automatically adapt markers according to the usage scenario and the desired capacity as well as the ability to combine information encoding with artistic stylization are the unique properties of our approach. As a byproduct, our approach provides an insight on the structure of patterns that are most suitable for recognition by ConvNets and on their ability to distinguish composite patterns.
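The pipeline the abstract describes, a synthesizer network that maps bit strings to marker images, a differentiable distortion stage standing in for fabrication and capture, and a recognizer network trained jointly with the synthesizer by backpropagation, can be sketched roughly as follows. This is a minimal illustration under simplifying assumptions, not the paper's architecture: the network shapes, the `camera_distortion` jitter, and the `gram_matrix` helper (which the paper would apply to activations of a pretrained classification network, not raw pixels) are all placeholders chosen for brevity.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

class MarkerSynthesizer(nn.Module):
    """Maps an n-bit string to a small color marker image with values in [0, 1]."""
    def __init__(self, n_bits=64, marker_size=32):
        super().__init__()
        self.marker_size = marker_size
        self.net = nn.Sequential(
            nn.Linear(n_bits, 256), nn.ReLU(),
            nn.Linear(256, 3 * marker_size * marker_size), nn.Sigmoid(),
        )

    def forward(self, bits):
        return self.net(bits).view(-1, 3, self.marker_size, self.marker_size)

class MarkerRecognizer(nn.Module):
    """Recovers the bit string (as per-bit logits) from a distorted marker photo."""
    def __init__(self, n_bits=64, marker_size=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(32 * (marker_size // 4) ** 2, n_bits),
        )

    def forward(self, img):
        return self.net(img)

def camera_distortion(img, noise=0.1):
    """Crude, fully differentiable stand-in for the photometric/geometric
    distortions of printing and photographing a marker: random per-image
    brightness/contrast jitter plus additive pixel noise."""
    b = img.size(0)
    contrast = 1.0 + 0.3 * (torch.rand(b, 1, 1, 1) - 0.5)
    brightness = 0.2 * (torch.rand(b, 1, 1, 1) - 0.5)
    return (img * contrast + brightness + noise * torch.randn_like(img)).clamp(0, 1)

def gram_matrix(feats):
    """Channel-wise correlation statistics of a feature map. A stylization
    loss in the paper's spirit would match these statistics (computed on
    pretrained-network activations) against a texture prototype."""
    b, c, h, w = feats.shape
    f = feats.view(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

# Joint training: one loss backpropagates through recognizer, distortion,
# and synthesizer alike, so both networks adapt to each other.
n_bits, batch = 64, 16
synth, recog = MarkerSynthesizer(n_bits), MarkerRecognizer(n_bits)
opt = torch.optim.Adam(list(synth.parameters()) + list(recog.parameters()), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

for step in range(200):
    bits = torch.randint(0, 2, (batch, n_bits)).float()  # random messages
    markers = synth(bits)                 # bit string -> marker image
    photos = camera_distortion(markers)   # simulated fabrication + capture
    logits = recog(photos)                # photo -> recovered bit logits
    loss = loss_fn(logits, bits)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Because the distortion stage sits between the two networks during training, the synthesizer is pushed toward patterns that survive the simulated capture process, which is the central point of the joint formulation.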


Reviews: Learnable Visual Markers

Neural Information Processing Systems

Briefly, the overall idea of chaining encoding, environment simulation, and recognition into a single end-to-end network is interesting and worth exploring. The presentation is clear and comprehensive. To me, however, there is still considerable room to develop the idea and its potential applications. Specifically, more experiments should have been done to support and demonstrate the idea. The end-to-end learning is intuitive and shown to be effective, at least qualitatively, but the model structure ties the synthesizer and the recognizer together, assuming that the recognizer is known in advance. In real situations, the two parts are typically fully decoupled, and this structure limits the approach's applications.


Learnable Visual Markers

Grinchuk, Oleg, Lebedev, Vadim, Lempitsky, Victor

Neural Information Processing Systems
